of Proportional Reasoning Tasks



Abstract 

Educators and psychologists are increasingly interested in modelling the processes
and knowledge structures by which people learn and solve problems. Progress has been
made in developing cognitive models in several domains, and in devising observational
settings that provide clues about subjects' cognition from this perspective. Less attention
has been paid to procedures for inference or decision-making with such information, given
that it provides only imperfect information about cognition (in short, test theory for
cognitive assessment). This paper describes probability-based inference in this context, and
illustrates its application with an example concerning proportional reasoning.

Key words: Bayesian inference, cognitive assessment, inference networks, multiple 
strategies, proportional reasoning, test theory

Introduction

The view of human learning rapidly emerging from cognitive and educational 
psychology emphasizes the active, constructive role of the learner in acquiring knowledge. 
Learners become more competent not simply by learning more facts and skills, but by 
configuring and reconfiguring their knowledge; by automating procedures and chunking 
information to reduce memory loads; and by developing models and strategies that help 
them discern when and how facts and skills are relevant. Educators have begun to view
school learning from this perspective, as a foundation for instruction in both the classroom
and intelligent computer-assisted instruction, or intelligent tutoring systems (ITSs).
Making educational decisions cast in this framework requires information about students in
the same terms. Glaser, Lesgold, and Lajoie state,

Achievement testing as we have defined it is a method of indexing stages of
competence through indicators of the level of development of knowledge, 
skill, and cognitive process. These indicators display stages of performance 
that have been attained and on which further learning can proceed. They 
also show forms of error and misconceptions in knowledge that result in 
inefficient and incomplete knowledge and skill, and that need instructional 
attention. (Glaser, Lesgold, & Lajoie, 1987, p. 81)

Standard test theory is designed to characterize students in terms of their tendencies
to make correct answers, not in terms of their skills, strategies, and knowledge structures.
Yet generalizations of the questions that led to standard test theory arise immediately in the
context Glaser and his colleagues describe: How can we design efficient observational
settings to gather the data we need? How can we make and justify decisions? How do we
evaluate and improve the quality of our efforts? Without a conceptual framework for
inference, rigorous answers to these questions are not forthcoming.

This presentation addresses issues in model building and statistical inference in the
context of student modelling. The statistical framework is that of inference networks (e.g.,
Pearl, 1988; Andreassen, Jensen, & Olesen, 1990). Ideas are demonstrated with data from
a test of proportional reasoning, based on work by Noelting (1980a, 1980b). The
observed data are subjects' comparisons of mixtures of juice and water, and their
explanations of the strategies by which they arrived at their answers. The cognitive
framework builds on Béland's (1989) structural analysis of the task component
relationships involved in their solution strategies.

Probability-based Inference in Cognitive Assessment 

Comparing the ways experts and novices solve problems in domains such as 
physics and chess (e.g., Chi, Feltovich & Glaser, 1981) reveals the central importance of 
knowledge structures — interconnected networks of concepts referred to as "frames" 
(Minsky, 1975) or "schemas" (Rumelhart, 1980)— that impart meaning to observations and 
actions. The process of learning is, to a large degree, expanding these structures and, 
importantly, reconfiguring them to incorporate new and qualitatively different connections
as the level of understanding deepens. Researchers in science and mathematics education 
have focused on identifying key concepts and schemas in these content areas, studying 
how they are typically acquired (e.g., in mechanics, Clement, 1982; in proportional
reasoning, Karplus, Pulos, & Stage, 1983), and constructing observational settings in
which students' understandings can be inferred (e.g., van den Heuvel, 1990; McDermott,
1984). A key feature of most of these studies is explaining patterns observed in learners' 
problem-solving behavior in terms of their knowledge structures. Riley, Greeno, and 
Heller (1983), for example, explain typical patterns of errors and correct answers in 
children's word problems in terms of a hierarchy of successively sophisticated procedural 
models. 

Once the relevance of states of understanding to instructional decisions is accepted, 
one immediately confronts the fact that these states cannot be ascertained with certainty;
they can be inferred only imperfectly from observations of the students' behavior.
Research in subject areas is beginning to provide observational situations (in their simplest
form, test items) that tap particular aspects of knowledge structures (e.g., Lesh, Landau, &
Hamilton, 1983; Marshall, 1989). Conformable statistical models must be capable of
expressing the nature and the strength of evidence that observations convey about
knowledge structures. Two kinds of variables are thus involved: those expressing
characteristics of an inherently unobservable student model, and those concerning qualities
of observable student behavior, the latter of which presumably carry information about the
former.

For the special case in which a student is adequately characterized by a single 
unobservable proficiency variable, a suitable statistical methodology has been developed 
within the paradigm of standard test theory, most notably under the rubric of item response 
theory (IRT; see Hambleton, 1989). IRT posits a model for the probability of a correct
response to a given test item, as a function of parameters for the examinee's proficiency
(often denoted θ) and measurement properties of the item. The IRT model provides the
structure through which observable responses to test items are related to one another and to
the unobservable proficiency variables. Item parameters specify the degree or strength of
relationships within that structure, by quantifying the conditional probabilities of item
responses given θ. Observed item responses induce a likelihood function for θ, opening
the door to statistical inference and decision-making models. The coupling of
probability-based inference with a simple student model for overall proficiency provides the
foundation for item development, test construction, adaptive testing, test equating, and validity
research, all provided, of course, that "overall proficiency" is sufficient for the job at
hand.

Models connecting observations with a broader array of cognitively-motivated
unobservable variables have begun to appear in the psychometric literature. Table 1 offers
a sampling. The approach we have begun to follow continues in the same spirit. In any
given implementation, the character of unobservable variables and the structure of their
interrelationships are derived from the structure and the psychology of the substantive area,
with the goal of capturing key distinctions among students. Probability distributions
characterize the likelihoods of potential observable variables, given values of the variables
in the unobservable student model. The relationship of the observable variables to the
unobservable variables characterizes the nature and amount of information they carry.

[Insert Table 1 about here] 

Of particular importance is the concept of conditional independence: a set of 
variables may be interrelated in a population, but independent given the values of another
set of variables. In cognitive models, relationships among observed variables are
"explained" by inherently unobservable, or latent, variables. Pearl (1988) argues that
creating such intervening variables is not merely a technical convenience, but a natural 
element in human reasoning: 

"...conditional independence is not a grace of nature for which we must wait
passively, but rather a psychological necessity which we satisfy actively by
organizing our knowledge in a specific way. An important tool in such
organization is the identification of intermediate variables that induce
conditional independence among observables; if such variables are not in
our vocabulary, we create them. In medical diagnosis, for instance, when
some symptoms directly influence one another, the medical profession
invents a name for that interaction (e.g., 'syndrome,' 'complication,'
'pathological state') and treats it as a new auxiliary variable that induces
conditional independence; dependency between any two interacting systems
is fully attributed to the dependencies of each on the auxiliary variable."
(Pearl, 1988, p. 44)

Inference Networks



A heritage of statistical inference under the paradigm described above extends back 
beyond IRT, to Charles Spearman's (e.g., 1907) early work with latent variables, Sewall
Wright's (1934) path analysis, and Paul Lazarsfeld's (1950) latent class models. The
resemblance of the inference networks presented below to LISREL diagrams (Jöreskog &
Sörbom, 1989) is no accident! The inferential logic of test theory is built around
conditional probability relationships: specifically, probabilities of observable variables
given theoretically-motivated unobservable variables.

The starting point is a recursive representation of the joint distribution of a set of
random variables; that is,

\[ p(X_1,\ldots,X_n) = \prod_{j=1}^{n} p(X_j \mid X_{j-1},\ldots,X_1) , \tag{1} \]

where the term for j=1 is defined as simply p(X1). A recursive representation can be
written for any ordering of the variables, but one that exploits conditional independence
relationships can be more useful. For example, under an IRT model with one latent
proficiency variable θ and three test items, X1, X2, and X3, it is equally valid to write

\[ p(X_1,X_2,X_3,\theta) = p(\theta \mid X_3,X_2,X_1)\, p(X_3 \mid X_2,X_1)\, p(X_2 \mid X_1)\, p(X_1) \tag{2} \]

or

\[ p(X_1,X_2,X_3,\theta) = p(X_3 \mid X_2,X_1,\theta)\, p(X_2 \mid X_1,\theta)\, p(X_1 \mid \theta)\, p(\theta) . \tag{3} \]

But (3) simplifies to

\[ p(X_1,X_2,X_3,\theta) = p(X_3 \mid \theta)\, p(X_2 \mid \theta)\, p(X_1 \mid \theta)\, p(\theta) , \tag{4} \]

the form that harnesses the power of IRT by expressing test performance as the
concatenation of conditionally independent item performances. More generally, (1) can be
re-written as

\[ p(X_1,\ldots,X_n) = \prod_{j} p(X_j \mid \{\text{parents of } X_j\}) , \tag{5} \]

where {parents of Xj} is the subset of variables upon which Xj is directly dependent.

Corresponding to the algebraic representation of p(Xi,...,Xn) in (5) is a graphical 
representation, a directed acyclic graph (DAG). Each variable is a node in the graph;
directed arrows run from parents to children, indicating conditional dependence
relationships among the variables. In this paper we refer to such a structure or its graphical
representation as an inference network. Figure 1 shows the DAGs that correspond to (2)
and (4) in the IRT example. Note that the simplified structure is apparent only in the graph
for (4). A DAG does not generally reveal conditional independence relationships that might 
arise under alternative orderings of the variables. 
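
The conditionally independent form of the IRT example can be made concrete with a small numerical sketch. The Python fragment below is illustrative only: it assumes a Rasch-type item response function, a five-point grid for θ with a uniform prior, and three invented item difficulties, none of which come from the study. It builds the joint distribution as p(θ) times the product of three item terms that each depend on θ alone, and then conditions on an observed response pattern.

```python
import math

# Hypothetical setup (not from the paper): a grid for theta, a uniform prior,
# and Rasch-type difficulties for three items X1, X2, X3.
THETA_GRID = [-2.0, -1.0, 0.0, 1.0, 2.0]
PRIOR = [1.0 / len(THETA_GRID)] * len(THETA_GRID)
DIFFICULTY = [-1.0, 0.0, 1.0]


def p_correct(theta, b):
    """Rasch-type probability of a correct response given theta."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def joint(responses, theta, prior_theta):
    """p(X1, X2, X3, theta) as p(theta) * prod_j p(Xj | theta)."""
    prob = prior_theta
    for x, b in zip(responses, DIFFICULTY):
        p = p_correct(theta, b)
        prob *= p if x == 1 else 1.0 - p
    return prob


def posterior(responses):
    """Posterior over the theta grid after observing a response pattern."""
    weights = [joint(responses, t, w) for t, w in zip(THETA_GRID, PRIOR)]
    total = sum(weights)
    return [w / total for w in weights]


post = posterior((1, 1, 0))
for t, p in zip(THETA_GRID, post):
    print(f"theta={t:+.1f}  posterior={p:.3f}")
```

Because each item term involves θ only, the likelihood is a simple product; this is the reverse flow of reasoning, from observed responses back to the latent variable.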

[Insert Figure 1 about here] 
Different fields of application emphasize different aspects of inference network 
representations of systems of variables. In factor analyses of mental tests, for example, 
one important objective is to find a "simple structure" representation of the relationships
among test scores, wherein each test has only a few latent variables as parents (e.g.,
Thurstone, 1947). In sociological and economic applications, path analysis is used to sort
out the direct and indirect effects of selected variables upon others (e.g., Blalock, 1971). 
In animal husbandry, where genotypes are latent nodes and inherited characteristics of 
animals are observable, interest lies in the predicted distribution of characteristics of the 
offspring of potential matings (e.g., Hilden, 1970). In medical diagnosis, disease states 
and syndromes are unobserved nodes, while symptoms and test results are potential
observables; ascertaining the latter guides diagnosis and treatment decisions (e.g.,
Andreassen, Jensen, & Olesen, 1990).

The latter arenas have sparked interest in calculating distributions of remaining
variables conditional on observed values of a subset. If the topology of the DAG is
favorable, such calculations can be carried out in real time in large systems by means of
local operations on small subsets of interrelated variables ("cliques") and their intersections.
The interested reader is referred to Lauritzen and Spiegelhalter (1988), Pearl (1988), and
Shafer and Shenoy (1988) for updating strategies, a kind of generalization of Bayes'
theorem. The calculations for the following example were carried out with Andersen,
Jensen, Olesen, and Jensen's (1989) HUGIN computer program.

The point of this presentation is that inference networks can be constructed around 
cognitive student models. The analogy to medical applications is sketched in Table 2. A
key aspect of the correspondence is the flow of diagnostic reasoning: Theory is expressed
in terms of conditional probabilities of observations given theoretically suggested
unobservable variables, and it is from this direction that the inference network is
constructed. Reasoning in practical applications flows in the opposite direction, as
evidence from observations is absorbed, to update belief about the unobservable variables. 
This necessity of bidirectional reasoning stimulates interest in probability-based inference, 
as accomplished by the generalizations of Bayes Theorem mentioned above. 

[Insert Table 2 about here] 

An Inference Network for a Set of Juice-Mixing Tasks 
Proportional reasoning is a topic of great current interest among mathematics and science 
educators, because it constitutes perhaps half of the middle school mathematics curriculum, and is
a prerequisite for quantitative aspects of the sciences as well as advanced topics in mathematics.
There is consequently considerable research on this topic among the communities of both
developmental psychology (e.g., Inhelder & Piaget, 1958; Siegler, 1978) and the psychology of
mathematics education (e.g., Romberg, Lamon, & Zarinnia, 1988). The network presented here is
based on a program of research on the development of proportional reasoning represented by
Noelting (1980a; 1980b) and Béland (1990). According to this conceptual framework, subjects'
cognitive strategies are explained in terms of the relationships they address vis a vis the structural
properties of the items. Development is viewed as a progression through qualitatively distinct
levels of understanding.

In order to study the concept of proportion, a basic test of twenty items was 
devised. Each consisted of predicting the relative taste of two drinks, labeled A and B, 
which comprised varying numbers of glasses of juice and glasses of water. Each mixture 
defined an ordered pair, that is (a, b) for the drink labeled A, and (c, d) for the drink
labeled B. The first term in each pair defined the number of glasses of juice and the second 
term defined the number of glasses of water, as shown by the example in Figure 2. In the 
test, the child had to decide if either A or B would taste juicier, or if both drinks would taste 
the same. The subjects also had to explain the reasons for their choices by writing a 
detailed explanation of how they had solved each problem. A total of 448 subjects,
ranging from fourth graders to university freshmen, were assessed. Instructions were
given and data collected in class groups. The order of item presentation was randomized 
for each child. To assure that the task was understood, sample items were solved by the 
classes. 

[Insert Figure 2 about here] 
An item's components were differentiated as being the varying quantities of juice glasses, 
which defined the attribute, and water glasses, which defined the complement, in each pair. When 
a subject attempted to solve an item by constructing transformations between similar terms in both 
pairs, that is, either between the attribute or the complement in both mixtures, then the relationships
were described as scalar. On the other hand, when the transformations were constructed between
the complementary terms within each pair, that is, between the attribute and the complement in a
mixture, then the relationships were described as functional. Three qualitatively distinct ordered
levels (listed below) were defined as a set of additive and multiplicative relations among the values
of these terms. These levels characterize both items and solution strategies: solution strategies, in
terms of the kinds of transformations and comparisons they involve; items, by virtue of their
structure, in terms of the minimal level required for a correct understanding of the problem. The
fact that some strategies led to success with items at one level, but to failure with items at higher
levels, indicates a structural discontinuity between these levels. This implies that the transition
between these levels involves restructuring, or reconceptualizing, the relationships among task
components, in response to the structural properties of the items. The three levels of
understanding are as follows.

• Level 1, the preoperational level, is characterized by the differentiation and
coordination of scalar and functional relationships. For example, one justification
for solving the item (2,1) vs. (3,4) was: "Mixture A tastes juicier because the
number of juice glasses is greater than the number of water glasses. By
comparison, mixture B tastes less juicy because the quantity of water glasses is
greater than juice glasses."

• Level 2, the concrete operational level, is characterized by the construction of an
equivalence class. For example, to solve the item (2,6) vs. (3,9), the typical
justification for the functional operator was: "Both drinks taste alike because there is
one glass of juice for three glasses of water, which defines the ratio 1:3 in both
pairs."

• Level 3, the formal operational level, is characterized by the construction of a
combinatorial system, building upon the concepts from the previous levels. An
item is solved either by the between state ratios (common denominator) or the
within state ratios (percentage). For example, when a ratio strategy was used to
solve (3,5) vs. (2,3), the typical justification was: "In Mixture A there are three
glasses of juice for five glasses of water, a ratio of 9:15. In Mixture B the ratio is
10:15 juice to water. Therefore, B tastes juicier."

The gradual extension of these structures, through exercise and practice, leads to the 
consolidation of the cognitive strategies as they are applied to solve the increasing complexity of 
the items within a level. This progression was defined as stage within level. Three successive 
stages, denoted as a, b, and c, were defined within each level. Table 3 summarizes the stages 
within levels. The reader is referred to Béland (1990) for additional detail and discussion.

[Insert Table 3 about here]

An Overview of the Network 

An inference network was constructed on the basis of the data described above, 
addressing subjects' optimal cognitive stage x level, or the highest stage and level at which
they were observed to perform during the course of observation, and the details of their
responses to three items, one at each level. This section introduces the network. The 
following section describes the variables in more detail, and discusses the specification of 
conditional probabilities. The section after that gives examples of reasoning from 
observations back to cognitive levels. 

The network addresses the three items shown in Figure 3, which appeared as 3, 8, 
and 17 in the master list. Item 3, (2,1) vs. (3,4), is a level 1 item, since it can be correctly
solved by a level 1 strategy: Mixture A has more juice than water, while B has more water
than juice. Item 8, (2,6) vs. (3,9), is a level 2 item, since it requires the construction of an
equivalence class. Item 17, (3,5) vs. (2,3), is a level 3 item, since a solution that correctly
attends to its structure must, in some way, compare ratios. 
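
The objectively correct response to each of the three items can be checked with exact arithmetic. The following is a minimal sketch, not a model of any subject's strategy: the function name `juicier` and the dictionary `ITEMS` are introduced here for illustration, and each mixture is given as a (juice, water) pair as in the text.

```python
from fractions import Fraction


def juicier(mix_a, mix_b):
    """Return 'A', 'B', or 'equal' for two (juice, water) glass counts."""
    (a, b), (c, d) = mix_a, mix_b
    # Compare juice concentration, juice / (juice + water), exactly.
    conc_a = Fraction(a, a + b)
    conc_b = Fraction(c, c + d)
    if conc_a > conc_b:
        return "A"
    if conc_b > conc_a:
        return "B"
    return "equal"


# The three items addressed by the network, keyed by master-list number.
ITEMS = {3: ((2, 1), (3, 4)), 8: ((2, 6), (3, 9)), 17: ((3, 5), (2, 3))}
answers = {item: juicier(a, b) for item, (a, b) in ITEMS.items()}
print(answers)  # {3: 'A', 8: 'equal', 17: 'B'}
```

Using `Fraction` avoids floating-point ties on equivalence items such as Item 8, where both mixtures reduce to the same ratio.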

[Insert Figure 3 about here]

The 21 variables in the network are listed below, with the number of possible
values each variable can take in parentheses. Detailed descriptions appear in the following
section.

X1   Optimal cognitive level (3).
X2   Stage within optimal level (3).
X3   Optimal stage x level (9).
X4j  Strategy employed on Item j, for j=3, 8, and 17 (10 per item).
X5j  Procedural analysis for Item j (4 per item).
X6j  Understanding of structure of Item j (2 per item).
X7j  Solution of Item j (2 per item).
X8j  Response choice on Item j (3 per item).
X9j  Objective correctness of response choice on Item j (2 per item).

Without constraints, the joint distribution of the variables listed above would be
specified as a probability for each of the 3x3x9x(10x4x3x2x2x2)^3 possible combinations
of values, about 7x10^10 of them. Under the assumed network, however,

\[
\begin{aligned}
p(X_1, X_2, X_3, X_{4,3}, X_{4,8}, X_{4,17}, \ldots, X_{9,3}, X_{9,8}, X_{9,17})
  &= p(X_1)\, p(X_2 \mid X_1)\, p(X_3 \mid X_2, X_1) \\
  &\quad \times \prod_{j} p(X_{4j} \mid X_3)\, p(X_{5j} \mid X_{4j})\, p(X_{6j} \mid X_{5j})\, p(X_{7j} \mid X_{5j})\, p(X_{8j} \mid X_{5j}, X_{4j})\, p(X_{9j} \mid X_{8j}) .
\end{aligned}
\tag{6}
\]

As examples, (6) implies conditional independence of item responses, X4,3, X4,8, and X4,17,
given a subject's optimal cognitive stage x level, X3 (although we discuss below relaxing this
assumption to account for processes that characterize the adaptive quality of children's strategy
choices during the course of testing); and conditional independence of the correctness of the
response choice for Item j, X9j, from all other variables given the identity of that response choice,
X8j. The most complex of these local relationships in (6) involves only three variables, and the
total number of distinct probabilities needed to approximate the full joint distribution is 3+9+81+
3(90+40+120+8+8+6), or 909. As we shall see, many of these relationships are logical rather
than empirical, and can be specified without recourse to data.
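
The two counts quoted above can be verified directly from the numbers of values listed for each variable. The short Python check below is a sketch; the constant and dictionary names are introduced here for illustration, and each conditional table is counted as (values of the child) times (values of its parents), which is the convention that reproduces the terms 90, 40, 120, 8, 8, and 6.

```python
from math import prod

# Number of values for each variable, as listed in the text.
N_LEVEL, N_STAGE, N_STAGE_X_LEVEL = 3, 3, 9  # X1, X2, X3
PER_ITEM = {"X4": 10, "X5": 4, "X6": 2, "X7": 2, "X8": 3, "X9": 2}
N_ITEMS = 3

# Unconstrained joint distribution: one probability per combination of values.
full_joint = (
    N_LEVEL * N_STAGE * N_STAGE_X_LEVEL * prod(PER_ITEM.values()) ** N_ITEMS
)

# Under the network: p(X1), p(X2 | X1), p(X3 | X2, X1) are shared by all items.
shared = N_LEVEL + N_STAGE * N_LEVEL + N_STAGE_X_LEVEL * N_STAGE * N_LEVEL
per_item = (
    PER_ITEM["X4"] * N_STAGE_X_LEVEL                       # p(X4j | X3)
    + PER_ITEM["X5"] * PER_ITEM["X4"]                      # p(X5j | X4j)
    + PER_ITEM["X8"] * PER_ITEM["X5"] * PER_ITEM["X4"]     # p(X8j | X5j, X4j)
    + PER_ITEM["X6"] * PER_ITEM["X5"]                      # p(X6j | X5j)
    + PER_ITEM["X7"] * PER_ITEM["X5"]                      # p(X7j | X5j)
    + PER_ITEM["X9"] * PER_ITEM["X8"]                      # p(X9j | X8j)
)
network_total = shared + N_ITEMS * per_item

print(full_joint)     # 71663616000, about 7 x 10^10
print(network_total)  # 909
```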

Figure 4 is the DAG corresponding to (6). Figure 5 is a similar graph from HUGIN, 
exhibiting for each node the baseline marginal distribution for each variable with bars representing 
the probabilities for each potential value of a variable. These population base rates were
established from the responses of all subjects, as described in the next section. Figure 5 represents
the state of knowledge one would have as a new subject from the same population is introduced.
As she makes responses, the relevant nodes will be updated to reflect certain knowledge of, say,
the correctness of a response or the strategy level used to justify it. This would be represented by a
probability bar extending all the way to one for the observed value. This information updates (still
imperfect) knowledge about her optimal cognitive level, and expectations about what might be 
observed on subsequent items. 

[Insert Figures 4 and 5 about here] 

Instantiating the Network 

The initial status of the network is the joint distribution of all the variables. It is specified 
via (6) in terms of the baseline distribution of any variables without parents, and the conditional 
distributions of each of the remaining variables given its parents. Béland's classifications of all
response explanations of all subjects into stage x level categories were employed, and treated as
known with certainty.^ Explanations of the variables and discussions of the conditional
probabilities associated with each follow.

^ A small proportion of the response strategies could not be classified, because subjects' explanations
were either omitted or incomprehensible. These responses were not useful in determining a subject's highest
strategy level, but they were included in the following analyses, with "undifferentiated" as a potential value
of strategy choice. The proportions for Items 3, 8, and 17 were 2%, 1%, and 11% respectively.

X1: Optimal cognitive level. Each subject was classified as to the stage and level of
his or her highest level solution strategy, based on Béland's analyses of all twenty of their
response explanations. X1 denotes their highest level, collapsing over stages within levels.
Because it has no parents, we need specify only population proportions: .08 for Level 1,
.45 for Level 2, and .47 for Level 3.

X2: Stage within optimal level. X2 breaks down Stage membership within levels, so X1 is
its parent. Empirical proportions were employed, leading to the values shown in Table 4. Again
these values are based on Béland's classification. Among the subjects whose highest observed
level of solution strategy was Level 2, for example, what proportions of these highest strategies
were at Stages a, b, and c of Level 2? Stages are meaningful only within levels, so the marginal
distribution of X2 that appears in Figure 5 is not very useful. If X1 were fixed at a particular value
of level, however, the resulting marginal distribution for X2 would be meaningful, taking the
values from the appropriate row of Table 4.

[Insert Table 4 about here] 

X3: Optimal stage x level. X3 is the detailed categorization of subjects into mutually
exclusive and exhaustive categories, in terms of levels and stages. It has as parents both level, X1,
and stage within level, X2. The specification of conditional probabilities under this arrangement is
logical rather than empirical: The conditional probability of a given stage-within-level value is 1
only if X1 and X2 take the appropriate values; otherwise, the conditional probability is zero. This
can be seen in Figure 6, where conditioning on X3=3b leads to probabilities of one for Level=3
and Stage-within-level=b. Actually no information would be lost by having X1 and X2 but not X3
in the model, or X3 but not X1 and X2. We have included all of them for interpretive convenience;
for example, X1 is useful for summarizing the "level" information in X3, whereas the values for
X3 lie at the same level of detail as those of the Item Strategy variables described below.
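
The logical character of the table for X3 can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the level proportions are those reported for X1, but the stage-within-level proportions standing in for Table 4 are invented placeholders, and the function names are introduced here for illustration only.

```python
LEVELS = (1, 2, 3)
STAGES = ("a", "b", "c")

# Population proportions for X1 as given in the text.
P_LEVEL = {1: 0.08, 2: 0.45, 3: 0.47}

# Placeholder for Table 4, p(X2 | X1). These values are invented
# for illustration and are NOT the empirical proportions.
P_STAGE_GIVEN_LEVEL = {
    1: {"a": 0.2, "b": 0.3, "c": 0.5},
    2: {"a": 0.3, "b": 0.4, "c": 0.3},
    3: {"a": 0.4, "b": 0.4, "c": 0.2},
}


def p_x3_given(x3, level, stage):
    """Logical table p(X3 | X1, X2): 1 for the matching label, else 0."""
    return 1.0 if x3 == f"{level}{stage}" else 0.0


def joint(level, stage, x3):
    """p(X1) p(X2 | X1) p(X3 | X2, X1)."""
    return (
        P_LEVEL[level]
        * P_STAGE_GIVEN_LEVEL[level][stage]
        * p_x3_given(x3, level, stage)
    )


def condition_on_x3(x3):
    """Posterior over (level, stage) after observing X3 = x3."""
    weights = {(l, s): joint(l, s, x3) for l in LEVELS for s in STAGES}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


post = condition_on_x3("3b")
print(post[(3, "b")])  # 1.0
```

Whatever stage proportions are used, conditioning on X3=3b places all probability on Level 3 and Stage b, which is the sense in which X3 carries the same information as X1 and X2 together.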

Under the "dialectical constructivist" developmental model sketched above, a subject's
optimal structure level defines the repertoire of strategies available for solving a given item, as
constructed through the changes and transformations that the subjects generated during the course
of testing. That is, the optimal state of understanding was constructed by the learners through a
series of mental operations that defined the successive levels of conceptualization elaborated to seek
the structural properties of the item. Consequently, the optimal structure was not necessarily
operationalized before the subjects undertook the task. The dynamics of this process are not
modelled in the present example, but will be discussed below. Conversely, the strategy required to
solve a given problem was not ultimately at the same level as the subject's optimal stage x level,
even when that level has been attained. This observation is taken into account in the present
model, through the conditional probability matrices for the following item strategy variables.

[Insert Figure 6 about here] 

X4j: Strategy employed on Item j (j=3, 8, 17). In addition to subjects' optimal strategy
stage x level, the particular strategies they employed in the three exemplar items were classified
according to stage x level, constituting the variables X4j. The additional value, abbreviated "Ud"
in the HUGIN diagrams, stands for "Undifferentiated;" these are the responses which could not be
classified. The X4j variables are modelled as conditionally independent, given their common
parent X3, optimal cognitive stage x level. The conditional probability matrices are presented in
Table 5. The following features are worth noting:

• With a few exceptions, a strategy at any level could be applied to any item. A small
number of "logical zeros" appear when the conceptual elements in a given strategy
class had no possible correspondents in the structure of an item (e.g., a 2b strategy
for Item 17).

• The entire upper right triangle of each matrix is filled with "logical zeros." By
definition, it is not possible to observe a response strategy at a higher stage x level
than a subject's optimal stage x level.

• The lower left triangle of each matrix was estimated empirically for the most part,
by simply entering the proportion of subjects classified in a given optimal stage x
level who were classified as employing each of the response strategies for a given
item. Probabilities that were logically possible but empirically zero were replaced
by small positive probabilities. It can be seen that considerable variation in strategy
choice on a given item often existed among subjects with the same optimal level.
Among subjects whose optimal stage x level was 3b, for example, about half
employed this powerful strategy for the more simply structured Item 8, while about
40% adapted their strategies to the structure of the item and employed a "minimally
sufficient" strategy at level 2b. This information appears graphically in Figure 6.
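
One way to assemble a row of such a matrix from classified responses is sketched below. The paper does not state the size of the "small positive probabilities" or how rows were renormalized, so the floor `EPSILON`, the renormalization step, the function name, and the counts are all assumptions made for illustration.

```python
# Strategy categories in increasing order, plus "Ud" (undifferentiated).
ORDERED = ["1a", "1b", "1c", "2a", "2b", "2c", "3a", "3b", "3c"]
STRATEGIES = ORDERED + ["Ud"]
EPSILON = 0.01  # assumed floor for "logically possible but empirically zero"


def strategy_row(optimal, counts, impossible=()):
    """Build p(X4j | X3 = optimal) from observed strategy counts.

    Strategies above the optimal stage x level, and any listed in
    `impossible`, are logical zeros. Other empirical zeros are raised
    to EPSILON before the row is renormalized to sum to one.
    """
    ceiling = ORDERED.index(optimal)
    total = sum(counts.values())
    row = {}
    for s in STRATEGIES:
        above = s in ORDERED and ORDERED.index(s) > ceiling
        if above or s in impossible:
            row[s] = 0.0
        else:
            row[s] = max(counts.get(s, 0) / total, EPSILON)
    norm = sum(row.values())
    return {s: p / norm for s, p in row.items()}


# Invented counts for subjects whose optimal stage x level is 2b.
row = strategy_row("2b", {"1c": 6, "2a": 10, "2b": 24})
for s in STRATEGIES:
    print(s, round(row[s], 3))
```

Keeping logical zeros exactly at zero while flooring empirical zeros preserves the distinction drawn above: an impossible strategy can never be inferred, whereas an unobserved but possible one remains available when evidence is absorbed.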

[Insert Table 5 about here] 

X5j: Procedural analysis for Item j. These variables summarize the results of the
matchups between cognitive strategies and qualitative outcomes. The four possible values
are "Success," in which a strategy at the same level as (isomorphic to) the item, or higher,
was successfully employed; "Strategic error," in which a strategy was employed which
failed to account for the item's structure; "Tactical error," in which a strategy appropriate to
the item structure was employed but not successfully executed; and "Computational error,"
in which the attempt would have been a "Success" except for an error in numerical
calculations. The respective X4j variables are the parents. Conditional probabilities
corresponding to "Strategic error" are logical, since this outcome is necessary if a strategy
that is insufficient vis a vis the item structure is applied, and impossible if a sufficient
strategy is applied.^ In the latter case, conditional probabilities are apportioned among
"Success," "Tactical error," and "Computational error." Table 6 lists the conditional
probability values.

[Insert Table 6 about here] 

X6j: Understanding of structure of Item j. These variables simply collapse from
their parents, the X5j's, into the dichotomy of "Understood" or "Misunderstood" the
structural properties of the item. In each case, the conditional probability matrix is logical:
the probability for "Understood" is one if the procedural analysis is "Success," "Tactical
error," or "Computational error," and zero otherwise; the probability for "Misunderstood"
is one if the procedural analysis is "Strategic error," and zero otherwise.

X7j: Solution of Item j. Each of these variables is an alternative collapsing of the 
corresponding X5j into the dichotomy of "Succeed" or "Failed." "Failed" occurs if the 
procedural analysis takes the value of "Strategic error," "Tactical error," or "Computational 
error." "Success" signifies a correct response through an appropriate strategy. 
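
Because the understanding and solution variables are both deterministic functions of the procedural analysis, the two collapses can be sketched in a few lines. The Python below is an illustrative sketch only: the function names are hypothetical, and the example distribution is the Table 6 row for a level 3a strategy on Item 17 (.70, .00, .10, .20).

```python
# Values of the procedural-analysis variable, in the order used in the text.
ANALYSIS_VALUES = ("Success", "Strategic error", "Tactical error", "Computational error")


def understanding_given_analysis(analysis):
    """Logical (0/1) distribution of the understanding variable given procedural analysis."""
    understood = 0.0 if analysis == "Strategic error" else 1.0
    return {"Understood": understood, "Misunderstood": 1.0 - understood}


def solution_given_analysis(analysis):
    """Logical (0/1) distribution of the solution variable given procedural analysis."""
    succeed = 1.0 if analysis == "Success" else 0.0
    return {"Succeed": succeed, "Failed": 1.0 - succeed}


def collapse(analysis_dist, cpt):
    """Marginalize a child variable over a distribution on procedural analysis."""
    out = {}
    for analysis, p in analysis_dist.items():
        for value, q in cpt(analysis).items():
            out[value] = out.get(value, 0.0) + p * q
    return out


# Procedural-analysis distribution for a level 3a strategy on Item 17 (Table 6).
analysis_3a_item17 = dict(zip(ANALYSIS_VALUES, (0.70, 0.00, 0.10, 0.20)))

understanding = collapse(analysis_3a_item17, understanding_given_analysis)
solution = collapse(analysis_3a_item17, solution_given_analysis)
```

Under these inputs the structure is understood with probability one (a 3a strategy is sufficient for Item 17), while the item is solved with probability .70; the remaining .30 is tactical or computational error.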

X8j: Response choice on Item j. These variables are the actual values of subjects' 
response choices: Mixture A juicier, Mixture B juicier, or equal. The parents of X8j are 
X4j, strategy, and X5j, procedural analysis. That is, conditional on a particular choice of 
strategy and the way it is applied on a given item, what are the probabilities of each of the 
three potential response choices? Table 7 gives the conditional probability table for Item 17 
as an example. Recall that whenever a strategy level is insufficient for an item's structure, 
that strategy level for X4j and "Success" for X5j cannot co-occur. This fact is accounted 
for in the conditional probability matrix for X5j given X4j, so the corresponding row in X8j 
is moot. Entries of equal probabilities appear as placeholders. Other combinations that 
were not logically impossible but which few or no subjects exhibited were assigned 
conditional probabilities that reflected Béland's judgement about likely outcomes, or, if 
there were no basis for such judgements, equal conditional probabilities. 

^ One exception: two distinct strategies are classified as 1b; one is appropriate for Item 3 but the other is 
not. 

[Insert Table 7 about here] 

X9j: "Objective" correctness of response on Item j. These variables indicate 
whether the choices specified in X8j are in fact correct, regardless of how they have been 
reached. We refer to these as "objective" responses because they are typically the only 
observations that are available in standard multiple-choice "objective" educational tests. In 
that context they are thought of as "noisy" versions of the X7js. The conditional 
probabilities are logical: for "Correct," the choice that happens to be correct for that item is 
assigned one and the other two are assigned zero; vice versa for "Incorrect." 

Absorbing Evidence 

The construction of the network described in the preceding section exemplifies reasoning 
from causes to effects, as it were. The initial status shown as Figure 5 represents our state of 
knowledge about a new individual from the same population: beliefs about her likely responses to 
the sample items and the optimal stage x level we might expect to observe over the course of the 
twenty-item test. Once she begins to respond, we update our knowledge about observed variables 
directly, and about still unobserved variables probabilistically. This section offers some examples 
of how observations update beliefs, particularly with regard to X1, "optimal cognitive level," and 
X2, "optimal stage x level." We focus on some interesting contrasts among the strength and nature 
of various observations for inferring subjects' cognitive levels. 

Recall that these data provide two distinct pieces of evidence on each item, a response 
choice and an explanation. A first example illustrates a distinction between the value of evidence 
from the two. Figure 7 shows the network after an incorrect response has been observed to Item 
17. The updated status of X6,17, the "Structure understood?" variable for Item 17, indicates an 
88% probability that this occurred because of an insufficient strategy and 12% due to inaccurate 
execution of a sufficient strategy, with probabilities of particular strategy levels shown in X4,17, 
the "Item strategy" variable for Item 17. Initial beliefs for cognitive levels 1, 2, and 3 in X1 of 8%, 
45%, and 47% have shifted toward the lower levels, to 13%, 54%, and 33% (cf. Figure 5). 
Expectations for correct responses and understandings of Items 3 and 8 have also been downgraded. 
Figure 8 shows the additional updating that occurs if we learn this incorrect response was arrived 
at by a strategy at level 3b, the level isomorphic to the item. Probable explanation for the failure 
is 20% tactical error, 80% computational error. Belief about overall cognitive level is concentrated 
on Level 3, and expectations for correct responses to remaining items increase beyond their initial 
status. 
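
The 20%/80% split can be checked by hand from Tables 6 and 7. The Python sketch below is illustrative, not the software used for the network: it takes Mixture B as the correct choice for Item 17 (as the "Success" rows of Table 7 imply), and the function and variable names are hypothetical.

```python
ANALYSIS_VALUES = ("Success", "Strategic error", "Tactical error", "Computational error")

# P(procedural analysis | item strategy = 3b) on Item 17, from Table 6.
p_analysis_given_3b = dict(zip(ANALYSIS_VALUES, (0.95, 0.00, 0.01, 0.04)))

# P(choice = Mixture B | item strategy = 3b, procedural analysis), from Table 7.
p_choice_b = dict(zip(ANALYSIS_VALUES, (1.00, 0.33, 0.00, 0.00)))


def posterior_given_incorrect(prior, p_correct):
    """Bayes' rule: P(analysis | incorrect choice), for a fixed item strategy."""
    joint = {a: prior[a] * (1.0 - p_correct[a]) for a in prior}
    total = sum(joint.values())  # P(incorrect choice | strategy)
    return {a: v / total for a, v in joint.items()}


posterior = posterior_given_incorrect(p_analysis_given_3b, p_choice_b)
```

With these inputs the only analyses compatible with a wrong answer under a 3b strategy are tactical error (.01) and computational error (.04), which normalize to .20 and .80.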

[Insert Figures 7 and 8 about here] 
As mentioned above, correct answers to multiple-choice items are typically taken as 
proxies for correct understandings in educational testing. Test developers avoid items with 
high "false positive" rates, or probabilities of correct answers by chance or by incorrect 
reasoning. Figure 9 reveals that Item 17 is just such an item. Of the subjects who 
responded with the correct choice, fewer than half did so with a strategy that accounted for 
the true structure of the item! In particular, a quarter of the correct responders employed a 
level 1b strategy: (3,5) is less juicy than (2,3) because (3,5) has more water. For this 
reason, a correct response on Item 17 shifts beliefs about optimal level upward only 
slightly. A correct explanation, on the other hand, would immediately establish certain 
belief at Level 3. 
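
Why Item 17 is prone to false positives can be seen by chaining Tables 6 and 7 to obtain the probability of the correct choice under each item strategy. The Python sketch below is a hedged illustration with hypothetical names; it again treats Mixture B as the correct choice and compares only strategies 1b and 3b.

```python
# Order of entries: Success, Strategic error, Tactical error, Computational error.

# Table 6, Item 17: P(procedural analysis | item strategy).
p_analysis = {
    "1b": (0.00, 1.00, 0.00, 0.00),
    "3b": (0.95, 0.00, 0.01, 0.04),
}

# Table 7: P(choice = Mixture B | item strategy, procedural analysis).
p_choice_b = {
    "1b": (0.33, 0.76, 0.33, 0.33),
    "3b": (1.00, 0.33, 0.00, 0.00),
}


def p_correct(strategy):
    """P(correct choice | item strategy), summing over procedural analysis."""
    return sum(pa * pb for pa, pb in zip(p_analysis[strategy], p_choice_b[strategy]))


p_correct_1b = p_correct("1b")
p_correct_3b = p_correct("3b")
```

Under these table values a level 1b strategy, which never accounts for the item's structure, still produces the correct choice with probability .76, against .95 for the isomorphic 3b strategy. The choice alone therefore discriminates weakly between the two.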

In contrast, Item 8 is a good multiple-choice item by test theoretic standards. 
Figure 10 shows that the overwhelming majority of subjects who answered correctly did so 
through a correct understanding of the equivalence-class structure of the item. 
Interestingly, posterior beliefs shift substantially to level 3 even though only a level 2 
strategy is required for correct understanding. This is because nearly all the subjects whose 
optimal level was 3 understood the structure of Item 8, while less than half of those whose 
optimal level was 2 did. To further identify whether a correct responder had level 2 or level 
3 as an optimal cognitive level would require additional information, such as checking the 
Item 8 explanation to see if it employed a level 3 strategy (if not, the probability for level 3 
would be reduced but not eliminated), or presenting a level 3 item not so prone as Item 17 
to false positives (an incorrect response would shift belief to level 2, a correct one to level 
3). We note in passing that the second of these options is conditionally independent of the 
Item 8 choice, given optimal level, whereas the first is not. The DAG (Figure 4) indicates 
the potential confounding or overlap of information about optimal level from multiple 
aspects of a response to a given item, due to the presence of the shared "Item strategy" 
variables linking aspects of information from the same item. One avoids "double 
counting," or overinterpreting partially redundant information by acting as if it were 
independent, by properly accounting for the inferential structure of the observations, as 
demonstrated in this example. 

[Insert Figures 9 and 10 about here] 
The question of which observation to secure next is addressed by a series of "what 
if" experiments (a preposterior analysis, in Bayesian terminology). At a given state of 
knowledge, one can run through the values of a yet unobserved variable, summing the 
information (in terms of, say, reduced entropy or decreased loss) at each with weights 
proportional to their predicted probability under current beliefs. The next observation can 
then be selected to be optimal, in terms of, say, reducing expected loss or reducing 
expected entropy for a particular unknown variable. This is a straightforward application 
of statistical decision theory (Raiffa & Schlaifer, 1961). 
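
A preposterior analysis of this kind can be sketched as an expected-entropy calculation. The Python below is a minimal illustration under stated assumptions: the prior is the initial belief about cognitive levels quoted earlier (8%, 45%, 47%), but the two candidate items and their likelihoods are HYPOTHETICAL numbers invented for the example, and entropy is only one of the criteria the text mentions.

```python
import math


def entropy(dist):
    """Shannon entropy (in bits) of a discrete distribution given as a list."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def expected_posterior_entropy(prior, likelihood):
    """Average posterior entropy over outcomes, weighted by predictive probability.

    likelihood[k][s] is P(outcome k | state s); over k, each column sums to one.
    """
    expected = 0.0
    for outcome_row in likelihood:
        joint = [p * l for p, l in zip(prior, outcome_row)]
        predictive = sum(joint)  # P(outcome) under current beliefs
        if predictive > 0:
            posterior = [j / predictive for j in joint]
            expected += predictive * entropy(posterior)
    return expected


# Current beliefs about cognitive levels 1, 2, 3 (initial status quoted in the text).
prior = [0.08, 0.45, 0.47]

# HYPOTHETICAL likelihoods: rows = (correct, incorrect), columns = levels 1, 2, 3.
candidates = {
    "item_A": [[0.30, 0.45, 0.95], [0.70, 0.55, 0.05]],
    "item_B": [[0.35, 0.40, 0.45], [0.65, 0.60, 0.55]],
}

# Expected reduction in entropy about cognitive level for each candidate observation.
gains = {name: entropy(prior) - expected_posterior_entropy(prior, lik)
         for name, lik in candidates.items()}
best_item = max(gains, key=gains.get)
```

Here `item_A`, whose outcome depends strongly on level, is preferred over `item_B`, whose outcome is nearly the same at every level. Replacing entropy with an expected loss would follow the same pattern.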

Comments on the Example 

This network provides a simple demonstration compared to the range of potential 
applications for probabilistic inference about cognitive student models. It does illustrate, however, 
probability-based reasoning built around structural relationships among cognitive strategies and the 
qualitatively different states of knowledge under a theory for the acquisition of proportional 
reasoning. 

One of the limitations of this model is that it only provides an explanation of the 
individual's knowledge organization for a single ability. Consequently, one next step in 
development might be broadening the scope of the model to accommodate more than one ability— 
for example, proportional reasoning in a different domain, or something more disparate such as 
spatial visualization or short-term memory capacity. This can be accomplished by analyzing the 
structural relationships among individuals' state of learning in different domains. From the 
cognitive researcher's point of view, an interesting outcome of this study is that it opens up new 
avenues of exploration in the research of mechanisms and/or processes that lead to the construction 
of knowledge. Such efforts might create new perspectives for a test theory based on cognitive 
models. The inferential machinery explored here complements the skill lattice theory Haertel and 
Wiley (in press) propose as a basis for constructing educational achievement tests. 

A more serious limitation is the treatment of subjects' cognitive state. Optimal level 
was operationalized in the network as the highest strategy level that a subject employed 
during the course of observation. This is appropriate for inferring the likelihood of a 
subject's highest level in the entire set knowing just a selected subset of responses. It only 
tells the whole story, however, under the assumption that a subject's likelihoods of 
response remained constant over the course of testing; that is, that a subject's toolkit of 
available cognitive strategies remains unchanged during testing. There is evidence that this 
is not the case. Cases have been observed in which a subject's previously highest level 
strategy proves inadequate for a subsequent item, the subject recognizes its inadequacy, 
and, in response to the structure of the item, adapts or extends previous strategies or 
devises new concepts and strategies. Indeed, selecting an item most likely to provoke this 
kind of restructuring lies at the essence of cognitive-based instruction (Vosniadou & 
Brewer, 1987)! 

The data from which the inference network described above was constructed would 
support an analysis of this phenomenon, and such work is currently in progress. Figure 
11 sketches one direction in which the network described above might be extended to 
capture key aspects of it. Rather than a single variable expressing a subject's cognitive 
status throughout the test, there is a distinct variable for each item presented. Cognitive 
status as it is in effect for Item j depends on the individual's cognitive status as it was 
before the item was presented and on the structure of Item j itself. The probability that 
assimilation or accommodation may occur from this interplay is expressed in a new 
"cognitive processes" variable. We would expect probabilities of adaptive restructuring to 
be essentially zero when the structure of the item lies below the subject's entering level and 
low when the item structure is far above her entering level, but maximal when the item lies 
just beyond what she has been able to handle up to that point. 

[Insert Figure 11 about here] 

Discussion 

A host of practical issues must be addressed in exploring the applicability of 
probability-based inference, via inference networks, to cognitive assessment. We conclude 
by mentioning a number of them. 

More ambitious student models. The proportional reasoning network discussed 
above has a very simple representation at its deepest level: a single "optimal level" variable 
entailing a class of available concepts and strategies. Our challenge was to model the 
structure of uncertain, partially redundant, sometimes conflicting evidence that observations 
convey about the deep variable. A single deep variable is obviously too simple for many 
practical applications, and we must explore ways to implement student models with many 
descriptors of knowledge structures, multiple strategy options, and metacognitive and/or 
affective variables. 

The assumed completeness of the network. The inference networks we have 
discussed are closed systems, which presume to account for all relevant possibilities; i.e., 
the space of student models is complete. In any application we can hope at best to model 
the key features distinguishing learners, certainly missing differences that will impact 
behavior. These differences are modelled as random variation. How does this affect 
inference? Can we build networks in such a way as to identify unexpected patterns, and to 
minimize resulting inferential errors? 

The nature of student models. Our basic idea is to provide for probabilistic 
reasoning from observations to student models. This idea can be entertained for any type 
of student models, but certainly it will prove more useful for some types of student models 
than others. Characteristics of student models that need to be explored in this connection 
include model grain-size, and the distinctions between overlay vs. performance models 
(Ohlsson, 1986), and static vs. dynamic models. 

• Grain-size concerns the level of detail at which to model students. As Greeno 
(1976) points out, "It may not be critical to distinguish between models differing in 
processing details if the details lack important implications for quality of student 
performance in instructional situations, or the ability of students to progress to 
further stages of knowledge and understanding." The grain-size of our example 
was stage x level. A coarser model would address level only, while a finer model 
might further differentiate strategies within stages within levels. 

• An "overlay" approach to diagnosing knowledge in the context of intelligent 
tutoring systems builds a representation of an expert's knowledge base, and infers 
from observed behavior where a student's representation falls short (e.g., C. 
Frederiksen & Breuleux, 1989). A "performance model" attempts to specify 
correct and/or incorrect elements of knowledge and application rules in sufficient 
detail to solve the same problems the student is attempting (e.g., VanLehn, 1990). 
Our example was a probabilistic version of a simple performance model, as it 
provides predictions of response probabilities for all items for subjects at all 
modelled states. 

• Static models assume a constant knowledge structure during the course of 
data-gathering; dynamic models expect, and attempt to model, changes in the learner 
along the way. The latter is obviously more ambitious, yet critical to applications 
such as ITSs in which learning is expected. White and J. Frederiksen's (1987) 
QUEST system, for example, builds performance models in the domain of simple 
electrical circuits; the process of instruction is viewed as facilitating the evolution of 
models, successively shaping student understanding toward that of an expert. 
Kimball's (1982) calculus tutor utilizes an approach that might be generalized: A 
student model is built under an assumption of stasis during a problem, but the prior 
distribution for the next problem is modified to reflect the outcome of the experience 
and a reinforcement model. Our example was static; Figure 11 sketched one 
possible dynamic extension. 

Decision-making and prediction. In the context of medical diagnosis, Szolovits and 
Pauker (1978, p. 128) point out the necessity of "...introducing some model of disease 
evolution in time, and dealing with treatment, as diagnosis is hard to divorce from therapy 
in any practical sense." In the context of education, we are concerned with learning and 
instruction. The Bayesian inferential machinery, as a component of statistical prediction 
and decision theory, is natural for this task. What is required is to extend a network to 
prediction and decision nodes, and to incorporate utilities as well as probabilities into 
decision rules. Andreassen, Jensen, and Olesen (1990) illustrate these ideas with a simple 
example from medical diagnosis. We must lay out the analogous extension in networks for 
cognitive assessment. 

Practical tools. While the inference network approach holds promise for tackling 
a class of problems in cognitive assessment, we are a long way from routinely engineering 
solutions to particular members of that class. This requires a methodological toolkit of 
generally applicable techniques and well-understood approaches. Building block models 
and heuristics are useful, for example, so that each application need not start from scratch. 
Foundational work on building-block models appears in Schum (1987). Work tailored to 
the kinds of observational settings and the kinds of psychological models anticipated in 
educational applications is required. And since simplifications of reality are inevitable, it is 
important to learn about the consequences of various model violations, and to develop 
diagnostic techniques for detecting serious ones. 

Conclusion 

The modelling approach sketched in this paper was motivated by the following 
consideration: 

Standard test theory evolved as the application of statistical theory with a 
simple model of ability that suited the decision-making environment of most 
mass educational systems. Broader educational options, based on insights 
into the nature of learning and supported by more powerful technologies, 
demand a broader range of models of capabilities, still simple compared to 
the realities of cognition, but capturing patterns that inform a broader range 
of instructional alternatives. A new test theory can be brought about by 
applying to well-chosen cognitive models the same general principles of 
statistical inference that led to standard test theory when applied to the 
simple model. (Mislevy, in press). 

Probabilistic inference about cognitive student models via inference networks provides a 
potential framework for a more broadly based test theory. Exploiting conceptual and 
computational advances in statistical inference, the approach presents an opportunity to 
extend the achievements of model-based measurement to educational problems cast in terms 
of contemporary cognitive and educational psychology. 

Table 1 

Test Theory Applications with a Cognitive Perspective 



1. Mislevy and Verhelst's (1990) mixture models for item responses when different 
examinees follow different solution strategies or use alternative mental models. 

2. Falmagne's (1989) and Haertel's (1984) latent class models for Binary Skills. 
Students are modelled in terms of the presence or absence of elements of skill or 
knowledge, and observational situations demand various combinations of them. 

3. Masters and Mislevy's (in press) and Wilson's (1989a) use of the Partial Credit 
rating scale model to characterize levels of understanding, as evidenced by the 
nature of a performance rather than its correctness. This incorporates into a 
probabilistic framework the cognitive perspective of Biggs and Collis's (1982) 
SOLO taxonomy for describing salient qualities of performances. 

4. Wilson's (1989b) Saltus model for characterizing stages of conceptual 
development, which parameterizes differential patterns of strength and 
weakness as learners progress through successive conceptualizations of a domain. 

5. Yamamoto's (1987) Hybrid model for dichotomous responses. This model 
characterizes an examinee as either belonging to one of a number of classes 
associated with states of understanding, or in a catch-all IRT class. The approach is 
useful when certain response patterns signal states of understanding for which 
particular educational experiences are known to be effective. 

6. Embretson's (1985) multicomponent models, which integrate item construction and 
inference within a unified cognitive model. The conditional probabilities of solution 
steps given a multifaceted student model are given by statistical structures 
developed in IRT. 

7. Tatsuoka's (1989) Rule space analysis, which uses a generalization of IRT methodology 
to define a metric for classifying examinees based on likely patterns of item 
response given patterns of knowledge and strategies. 

Table 2 

Parallels between Inference Networks in Medical and Educational Applications 

Medical Application                       Educational Application 

Observable symptoms, medical tests        Test items, verbal protocols, 
                                          observers' ratings, solution traces 

Disease states, syndromes                 States or levels of understanding of 
                                          key concepts, available strategies 

Architecture of interconnections based    Architecture of interconnections based 
on medical theory                         on cognitive and educational theory 

Conditional probabilities given by        Conditional probabilities given by 
physiological models, empirical data,     psychological models, empirical data, 
expert opinion                            expert opinion 

Table 3 

Stages within Cognitive Levels 

Level 1: Conceptual or preoperational 

a Sole comparison of the number of juice glasses, the attribute in both pairs. 

b Appraisal of the dilution effect of the water on the final taste of juice. From this, 
the order of magnitude became a comparison of the number of water glasses, the 
complement in both pairs. 

c Construction of functional relations between the complementary terms in each pair, 
establishing between relations in the pair of within relations first constructed. 

Level 2: Concrete operational 

a Use of the ratio "one glass of juice for one glass of water" to demonstrate that both 
terms within each pair were equal. 

b Joint multiplication of both terms within a pair or, otherwise, an operation of 
co-multiplication. (Scalar operator; e.g., "Both drinks taste alike because there is one 
glass of juice for three glasses of water, which defines the ratio 1:3 in both pairs.") 

c Relationships formed between both terms of each pair, when the first term was 
divided by the second. (Functional operator; e.g., "The ratio of two glasses of juice 
for six glasses of water is the same as one glass of juice for three glasses of water. 
Three times the ratio 1:3 equals three glasses of juice for nine glasses of water. 
Therefore both drinks taste alike.") 

Level 3: Formal operational 

a Either a scalar or functional operator in the between or the within relations. 

b Ratio treatment: The components of the relationships were the attribute and the 
complement. (E.g., "In Mixture A there are three glasses of juice for five glasses 
of water, a ratio of 9:15. In Mixture B the ratio is 10:15 juice to water. Therefore, 
Mixture B tastes juicier.") 

c Fraction treatment: The components of the relationships were the attribute and the 
quantity of liquid. (E.g., "In Mixture A, of a total of 8 glasses, 3 contain juice, 
representing a fraction of 15/40. In Mixture B, of a total of 5 glasses, 2 were juice, 
representing a fraction of 16/40. Therefore, Mixture B tastes juicier.") 

Table 4 

Conditional Probabilities of Stages within Cognitive Levels 

             Stage within Level 
Level        a        b        c 

1          .000     .612     .388 
2          .582     .345     .073 
3          .145     .667     .188 
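
Table 4 combines with beliefs about cognitive level to give beliefs about stage x level, by multiplying each level's probability by the conditional probabilities of its stages. The Python sketch below assumes the initial beliefs about levels 1, 2, and 3 quoted in the text (8%, 45%, 47%) as the prior; variable names are hypothetical.

```python
# Prior beliefs about cognitive levels 1, 2, 3 (initial status quoted in the text).
p_level = {1: 0.08, 2: 0.45, 3: 0.47}

# Table 4: P(stage | level).
p_stage_given_level = {
    1: {"a": 0.000, "b": 0.612, "c": 0.388},
    2: {"a": 0.582, "b": 0.345, "c": 0.073},
    3: {"a": 0.145, "b": 0.667, "c": 0.188},
}

# Joint probability of each stage x level, e.g. P("3b") = P(level 3) * P(stage b | level 3).
p_stage_x_level = {
    f"{level}{stage}": p_level[level] * p
    for level, stages in p_stage_given_level.items()
    for stage, p in stages.items()
}
```

Under this prior, stage x level 3b receives .47 x .667, or about .31, and 1a receives zero, which is why rows for optimal level 1a in the later tables carry no information.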

Table 5 

Conditional Probabilities of Strategies given Optimal Cognitive Levels 

                        Strategy Level of Response 
Optimal     Ud    1a    1b    1c    2a    2b    2c    3a    3b    3c 

(Item 3) 
1a          [?]   [?]   .00   .00   .00   .00   .00   .00   .00   .00 
1b          .08   .04   .88   .00   .00   .00   .00   .00   .00   .00 
1c          .01   .01   .34   .64   .00   .00   .00   .00   .00   .00 
2a          [?]   .07   [?]   .39   .21   .00   .00   .00   .00   .00 
2b          [?]   .01   .34   .54   .09   .01   .00   .00   .00   .00 
2c          .01   .01   [?]   .52   .06   .01   .01   .00   .00   .00 
3a          [?]   .01   .20   .74   .02   .01   .01   .01   .00   .00 
3b          .01   .01   .02   .21   .02   .01   .01   .01   .71   .00 
3c          .01   .01   .01   .18   .02   .01   .01   .01   .10   .65 

(Item 8) 
1a          .50   .50   .00   .00   .00   .00   .00   .00   .00   .00 
1b          .01   .04   .95   .00   .00   .00   .00   .00   .00   .00 
1c          .01   .02   .96   .01   .00   .00   .00   .00   .00   .00 
2a          .01   .02   .58   .04   .35   .00   .00   .00   .00   .00 
2b          .01   .02   .32   .01   .31   .33   .00   .00   .00   .00 
2c          .01   .02   .06   .01   .24   .60   .06   .00   .00   .00 
3a          .01   .02   .11   .01   .08   .74   .02   .01   .00   .00 
3b          .01   .01   .01   .01   .01   .41   .01   .01   .52   .00 
3c          .01   .01   .01   .01   .01   .29   .01   .01   .07   .57 

(Item 17) 
1a          .50   .50   .00   .00   .00   .00   .00   .00   .00   .00 
1b          .07   .01   .92   .00   .00   .00   .00   .00   .00   .00 
1c          .04   .01   .94   .01   .00   .00   .00   .00   .00   .00 
2a          .03   .01   .43   .06   .47   .00   .00   .00   .00   .00 
2b          .01   .01   .46   .01   .51   .00   .00   .00   .00   .00 
2c          .04   .01   .13   .01   .50   .00   .31   .00   .00   .00 
3a          .04   .01   .12   .03   .40   .00   .18   .22   .00   .00 
3b          .01   .01   .01   .01   .04   .00   .01   .01   .90   .00 
3c          .01   .01   .01   .01   .01   .00   .01   .01   .18   .75 

Table 6 

Conditional Probabilities of Procedural Analysis given Item Strategies 

Item                    Strategic    Tactical    Computational 
Strategy     Success    Error        Error       Error 

(Item 3) 
Ud            .00        1.00         .00         .00 
1a            .00        1.00         .00         .00 
1b            .75         .20         .05         .00 
1c            .98         .00         .02         .00 
2a            .85         .00         .15         .00 
2b            .98         .00         .01         .01 
2c            .97         .00         .02         .01 
3a            .96         .00         .02         .02 
3b            .98         .00         .01         .01 
3c            .90         .00         .08         .02 

(Item 8) 
Ud            .00        1.00         .00         .00 
1a            .00        1.00         .00         .00 
1b            .00        1.00         .00         .00 
1c            .00        1.00         .00         .00 
2a            .00        1.00         .00         .00 
2b            .98         .00         .01         .01 
2c            .00        1.00         .00         .00 
3a            .98         .00         .01         .01 
3b            .98         .00         .01         .01 
3c            .96         .00         .02         .02 

(Item 17) 
Ud            .00        1.00         .00         .00 
1a            .00        1.00         .00         .00 
1b            .00        1.00         .00         .00 
1c            .00        1.00         .00         .00 
2a            .00        1.00         .00         .00 
2b            .00        1.00         .00         .00 
2c            .00        1.00         .00         .00 
3a            .70         .00         .10         .20 
3b            .95         .00         .01         .04 
3c            .97         .00         .02         .01 

Table 7 

Conditional Probabilities of Item 17 Choice given Item Strategies and Procedural Analysis 

                    Procedural                          Choice 
Strategy            Analysis                Mixture A   Mixture B   Equal 

Undifferentiated    Success                   .33         .33        .33 
Undifferentiated    Strategic Error           .13         .12        .75 
Undifferentiated    Tactical Error            .33         .33        .33 
Undifferentiated    Computational Error       .33         .33        .33 

1a                  Success                   .33         .33        .33 
1a                  Strategic Error           .98         .01        .01 
1a                  Tactical Error            .33         .33        .33 
1a                  Computational Error       .33         .33        .33 

1b                  Success                   .33         .33        .33 
1b                  Strategic Error           .23         .76        .01 
1b                  Tactical Error            .33         .33        .33 
1b                  Computational Error       .33         .33        .33 

1c                  Success                   .33         .33        .33 
1c                  Strategic Error           .01         .01        .98 
1c                  Tactical Error            .33         .33        .33 
1c                  Computational Error       .33         .33        .33 

2a                  Success                   .33         .33        .33 
2a                  Strategic Error           .03         .95        .02 
2a                  Tactical Error            .33         .33        .33 
2a                  Computational Error       .33         .33        .33 

2b                  Success                   .33         .33        .33 
2b                  Strategic Error           .33         .33        .33 
2b                  Tactical Error            .33         .33        .33 
2b                  Computational Error       .33         .33        .33 

2c                  Success                   .33         .33        .33 
2c                  Strategic Error           .01         .01        .98 
2c                  Tactical Error            .33         .33        .33 
2c                  Computational Error       .33         .33        .33 

3a                  Success                   .00        1.00        .00 
3a                  Strategic Error           .33         .33        .33 
3a                  Tactical Error            .80         .00        .20 
3a                  Computational Error       .50         .00        .50 

3b                  Success                   .00        1.00        .00 
3b                  Strategic Error           .33         .33        .33 
3b                  Tactical Error            .50         .00        .50 
3b                  Computational Error       .38         .00        .62 

3c                  Success                   .00        1.00        .00 
3c                  Strategic Error           .33         .33        .33 
3c                  Tactical Error            .90         .00        .10 
3c                  Computational Error       .70         .00        .30 

$p(X_1, X_2, X_3, \theta) = p(\theta \mid X_3, X_2, X_1)\, p(X_3 \mid X_2, X_1)\, p(X_2 \mid X_1)\, p(X_1)$ 

$p(X_1, X_2, X_3, \theta) = p(X_1 \mid \theta)\, p(X_2 \mid \theta)\, p(X_3 \mid \theta)\, p(\theta)$ 

Figure 1 

Graphical Representations in the IRT Example 

Figure 2 

A Sample Juice-Mixing Task 
(Prompt: "Which mixture will be more juicy: A, B, or both the same?") 

Figure 3 

Three Juice-Mixing Tasks 
(Panels: Item 3, Item 8, Item 17; each shows Mixture A and Mixture B) 

Figure 4 

Graph of the Juice-Mixing Network 

Figure 5 

Initial Status, with Marginal Probabilities 

Figure 6 

Status Conditional on Optimal Level = 3b 

[...] and Item 17 Strategy = Level 3b 

Figure 9 

Status Conditional on Item 17 Response Choice = Right 